ABSTRACT


It  is  a  common  practice  for  people  involved  in  research 
to  take  note  of  books,  articles,  reports  and  other  materials 
relevant  to  their  work.  As  a  collection  of  such  references 
grows,  it  becomes  more  and  more  difficult  to  locate  specific 
ones,  so  that  some  form  of  retrieval  system,  however  simple, 
is needed. This thesis examines the characteristics and use
of  a  personal  document  collection,  then  semi-quantitatively 
evaluates  retrieval  systems  presently  available  to  a 
researcher.  Finally,  a  simple  alternative  system,  based  on  a 
computer  program  providing  access  to  records  stored  in  a 
sequential  file,  is  described. 

I. INTRODUCTION


As  the  body  of  literature  in  a  researcher's  field 
grows,  it  is  likely  that  he  will  begin  to  keep  a  file  of 
references  to  material  that  interests  him.  The  problem  of 
providing  access  to  such  a  file  is  one  that  has  been 
recognised  for  a  number  of  years,  and  becomes  more  acute  as 
the  file's  size  increases.  Several  retrieval  systems  have 
been  adapted  for  use  by  an  individual  researcher;  these 
range  from  traditional  author/title/subject  card  systems  to 
relatively  complex  ones  involving  extensive  subject  analysis 
and  subsequent  coding  of  information.1  More  recently, 
computer-based  systems  have  been  described.2 

This  thesis  will  first  examine  the  nature  and  use  of 
personal  collections,  based  on  a  survey  of  the  literature.  I 
will  then  analyse  the  traditional  manual  retrieval  systems, 
as  used  by  an  individual  researcher.  Next,  the  application 
of  modern  data  base  management  systems,  such  as  SPIRES 
(Stanford Public Information REtrieval System), RIQS (Remote


1Gerald Jahoda. Information storage and retrieval systems
for individual researchers. (New York: Wiley-Interscience,
1970)

A. C. Foskett. A guide to personal indexes using
edge-notched, Uniterm and peek-a-boo cards. 2nd ed. (London:
Clive Bingley, c1970)

2Peter  Leggate  et  al.  "An  on-line  system  for  handling 
personal  data  bases  on  a  PDP  11/20  minicomputer."  Aslib 
Proceedings,  29(2):56-66,  February  1977. 

Benjamin  Mittman  and  Lorraine  Borman.  Personalized  data  base 
systems.  (Los  Angeles:  Melville  Publishing  Company,  c1975.) 
312  p. 

Information Query System), and FAMULUS, to this problem will
be  discussed.  Finally,  an  alternative  system,  based  on  a 
PL/1  program  providing  weighted-term  searching,  will  be 
described. 

I assume that the reader is familiar with the
characteristics and use of conventional, coordinate, and
edge-notched card systems, and therefore the section dealing
with these is brief. Jahoda3 describes their use in detail,
and should be referred to if necessary for clarification.

Since  I  am  assuming  that  a  personal  retrieval  system 
will  be  primarily  the  work  of  one  person,  who  has  little 
time  to  spend  on  its  development,  this  thesis  will 
concentrate  on  the  procedures  used  to  store  and  subsequently 
to  retrieve  information,  rather  than  on  the  recall  and 
precision  of  the  system.  Techniques  for  improving  system 
performance  in  these  areas  have  been  described  previously4; 
since  these  involve  greater  depth  of  subject  analysis  and/or 
vocabulary  control,  they  will  therefore  increase  the  time 
taken  to  add  new  references.  The  degree  to  which  such 
measures  are  necessary  should  be  determined  by  the  user  of 
the  system,  according  to  his  own  particular  needs  and 
standards. 

I make no attempt to provide a quantitative analysis of
the time taken to set up, alter, and use each system, since
it will obviously vary from person to person, and will also
depend on the depth of indexing done.

3Jahoda,  Information  storage  and  retrieval  systems. 

4Foskett, A guide to personal indexes, p.14-19.

I also do not discuss the physical arrangement of the
items  the  researcher  owns,  since  I  feel  that  any  logical 
arrangement  will  be  satisfactory.  Once  the  searcher  finds 
that  he  does  own  a  particular  book,  or  has  a  copy  of  a 
specific  article,  he  will  probably  know,  at  least  roughly, 
where  it  is  kept. 




II.  PERSONAL  REFERENCE  COLLECTIONS 


In  order  to  determine  the  retrieval  needs  of  a 
researcher,  it  is  first  necessary  to  examine  the 
characteristics  and  use  of  a  personal  reference  collection. 
This  will  be  done  by  means  of  a  survey  of  the  literature. 


A.  Characteristics 

Little formal work has been done to identify the nature
of  personal  document  collections,  perhaps  because  they  vary 
too  greatly  for  any  meaningful  conclusions  to  be  drawn.  The 
main  study  is  that  done  by  Burton5,  who  found  that  such 
collections  generally  had  well-defined  patterns  of  input 
from  a  variety  of  sources,  and  tended  to  concentrate  on 
primary  literature.  Additional  information  about  personal 
reference  collections  can  be  found  in  the  literature 
describing  the  ways  in  which  researchers  cope  with  the 
"information  explosion".  Examples  of  such  articles  are  those 
by  Cushing6  and  Hoff7.  More  recently,  researchers  have 


5Hilary D. Burton. "Personal documentation methods and
practices with analysis of their relation to formal
bibliographic systems and theory." Microform Ph.D. thesis.
University of California at Berkeley, 1972.

6Ralph Cushing. "A fresh look at improving personal filing
systems." Chemical Engineering, 70(1):73-88, January 7,
1968.

7Wilbur J. Hoff. "A retrieval system for health education
information." International Journal of Health Education,
9(2):87-93, June 1966.

described simple computer-aided systems. Bridges8 describes a
program which produces printed indexes, while Jameson9
presents  a  batch  retrieval  system.  It  is  interesting  to  note 
that  few  of  the  systems  described  in  these  articles  have 
been  developed  in  consultation  with  a  librarian  or 
information  specialist.  In  fact,  the  authors  emphasise  the 
independence  and  adaptability  of  their  systems. 

However,  it  is  possible  to  generalise  somewhat  about 
personal  reference  collections,  and  to  make  certain 
assumptions  about  their  nature.  First,  such  a  collection  is 
the  work  of  only  one  person,  and  reflects  his  idiosyncrasies 
and  interests.  It  includes  a  variety  of  material,  some  of 
which  is  actually  owned  by  the  researcher,  with  the  rest 
available  in  the  library,  or  perhaps  in  the  collection  of  a 
colleague.  The  growth  of  the  collection  is  erratic, 
depending  on  a  number  of  factors,  such  as  the  stage  of 
research  and  field  of  study.  Obviously,  when  a  new  project 
is  begun,  the  collection  will  grow  rapidly  as  background 
material  is  gathered.  When  the  research  is  well  underway, 
the  growth  may  slow  down,  since  the  researcher  is  not 
actively  collecting  information.  Then,  when  a  paper  or 
report  is  being  prepared,  the  collection  will  probably  be 
used  heavily  (and  possibly  augmented),  as  references  are 


8Kent  W.  Bridges.  "Automatic  indexing  of  personal 
bibliographies."  Bioscience,  20(2):94-97,  January  15,  1970. 

9David  L.  Jameson.  "Information  retrieval  for  the  working 
scientist: a simple algorithm." Bioscience, 19(3):232-233,
March  1969. 

verified. As Yerke10 points out, the relevance of a
particular document will also depend on the stage of
research, since new discoveries may lead to new hypotheses.
These characteristics have implications for the type of
retrieval  system  that  will  be  suitable;  these  will  be 
discussed  in  the  next  section. 


B.  Use  of  a  personal  reference  collection 

The two main studies of the ways in which personal
reference collections are used were done by Jahoda,
Hutchins, and Galford11 and Burton12. Both found that the
amount of information about an item included in such a file
varied from person to person, and that where a retrieval
system was in use, the access points also varied. However,
the minimum information included was author, title,
citation, and an indication of location; the most common
access points used were subject, author, and source
(including date). Searches, when done, had a range of
complexities, but usually involved one to four concepts.
Jahoda also studied the correspondence between title and
subject, finding that title words alone could be used to
locate relevant documents for over half the questions asked.

10Theodor B. Yerke. "Computer support of the researcher's
own documentation." Datamation, 16(2):75-78, February 1970.

11Gerald Jahoda, R. D. Hutchins, and R. R. Galford.
"Analysis of case histories of personal index use." American
Documentation. Proceedings, 4:245-254, 1966.
idem. "Characteristics and use of personal indexes
maintained by scientists at one university." American
Documentation, 17(2):71-75, April 1966.

12Burton.  "Personal  documentation  methods  and  practices." 

The preceding discussion has implications for the
design  of  a  retrieval  system  for  use  by  individual 
researchers.  In  fact,  the  standards  first  described  by 
Krieger13  still,  for  the  most  part,  apply.  According  to 
Krieger,  a  personal  retrieval  system  should: 

1.  find  all  information  sought 

2.  allow  for  subjects  not  originally  included 

3.  reduce  duplication  of  entries  to  a  minimum 

4.  involve  as  little  coding  as  possible 

5. avoid the use of complex sorting or punching machinery

A few more comments are necessary. First, because such a
system will be established and maintained by only one
person, the time required to set up, update, and use it
should  be  kept  to  a  minimum,  yet  the  system  should  achieve 
the  desired  results.  It  should  be  able  to  accommodate  a 
variety  of  information,  including  author,  title,  citation, 
and  location.  Searching  for  a  number  of  concepts  must  be 
possible,  and  combinations  of  author,  subject,  and  date  must 
be  searchable.  As  Yerke  has  pointed  out,  some  documents  may 
become  obsolete  or  irrelevant,  so,  in  addition,  the  system 
should  allow  for  the  removal  of  a  record  when  it  is  no 
longer  required. 

To  summarise,  then,  a  retrieval  system  designed  for  use 
by  an  individual  researcher  should: 

1.  have  a  flexible  input  format 

13K. A. Krieger. "A punched card system for chemical
literature." Journal of Chemical Documentation,
26(3):163-166, March 1949.

2. be easy to use, and not involve extensive coding or
checking  of  subjects 

3.  allow  additional  subject  coverage 

4.  allow  records  no  longer  wanted  to  be  removed 

5.  avoid  duplication  of  entries 

The  above  criteria  will  now  be  used  to  evaluate  methods 
which  have  been  suggested  for  use  by  individual  researchers. 

III. MANUAL SYSTEMS


The  main  card-based  systems  described  for  use  by 
researchers  are 

1.  the  traditional  author/title/subject  system,  similar  to 
those  found  in  most  libraries 

2.  coordinate  indexes,  including  both  feature  card  indexes, 
such  as  the  Uniterm  and  optical  coincidence  systems,  and 
edge-notched  card  systems 

Each  of  these  will  be  briefly  described,  then  analysed  to 
determine  whether  they  meet  the  requirements  discussed  in 
the  previous  section. 


A.  Conventional  Systems 

Conventional  systems  are  based  on  a  set  of  subject 
headings,  consisting  of  words  or  phrases.  These  may  come 
from an existing list or thesaurus, such as the ERIC
thesaurus,  or  the  researcher  may  prefer  to  compile  his  own 
list.  When  an  item  is  added  to  the  collection,  terms  which 
describe  its  content  are  selected  and  a  card  made  up  for 
each  access  point.  The  cards  are  then  kept  in  alphabetical 
order  by  subject  heading.  In  addition,  author/title  entries 
may  be  made,  with  an  entry  for  each  author  and  also  for  the 
title.  The  amount  of  information  on  each  card  may  vary,  but 
there  are  two  main  forms. 

In the first, complete information (author, title,
citation, and location) appears only on the main author
card; the rest of the cards just give author and title. This
means that subject searching is a two-step process. First,
relevant subject headings are consulted. Then the
information needed to find the items wanted is obtained from
the author file.

The  other  form  of  the  conventional  card  system  involves 
repeating  the  full  information  about  an  item  on  each  card 
relating  to  it.  Searching  is  much  simpler,  but  adding 
references  is  more  repetitive,  and  also  more  time-consuming. 

Obviously,  such  a  system  has  both  advantages  and 
disadvantages  for  the  researcher  who  wishes  to  use  it.  The 
main  advantage  is  the  flexibility  of  the  subject  headings. 
The  researcher  can  be  as  specific  as  he  likes  when  indexing 
items  for  retrieval,  simply  by  modifying  the  phrase  for  the 
main  subject.  This  can,  however,  also  be  a  disadvantage, 
since  it  may  be  difficult  to  predict  which  aspects  of  an 
article  will  be  of  interest  in  the  future.  To  re-assign 
subjects  often  would  result  in  an  extremely  large  file  of 
subjects,  and  merely  to  discard  cards  no  longer  wanted  would 
represent  a  waste  of  effort.  Another  disadvantage  to  the 
researcher  is  the  difficulty  of  searching  for  a  combination 
of  subjects  not  represented  by  one  subject  heading.  Jahoda14 
found  that  such  a  conventional  subject  index  works  well  when 
there  are  three  concepts  or  fewer  per  search,  something  not 
necessarily  true  in  the  case  of  the  researcher,  who  is 

14Gerald Jahoda. "A technique for determining index
requirements." American Documentation, 15(2):82-85, April
1964.

likely to have a specific question to be answered. In
addition,  there  is  considerable  duplication  of  effort,  both 
in  compiling  the  index,  and  in  using  it.  First,  information 
is  repeated  on  each  card,  at  least  to  some  extent.  Then,  the 
researcher  must  spend  time  sorting  and  interfiling  new  cards 
into  the  existing  index.  While  the  actual  amount  of  time 
spent  will  vary  with  the  depth  of  subject  analysis  done  and 
also  with  the  growth  rate  of  the  collection,  it  is  clear 
that  much  of  the  time  spent  in  organising  such  a  system  is 
clerical  in  nature,  and  probably  does  not  represent  a 
valuable  expenditure  of  a  highly  trained  person’s  time.  On 
the  other  hand,  such  a  mindless  activity  may  sometimes  have 
therapeutic value, particularly if the researcher is dealing
with  a  stubborn  problem. 

The  conventional  card  system,  then,  does  not  meet 
criterion  number  5,  since  it  involves  a  fair  amount  of 
duplication. If sufficient time and effort are spent in
compiling  such  an  index  to  a  collection,  the  system  is  a 
good  one,  as  libraries  show.  However,  for  the  researcher  who 
wants a simple and efficient system, it is definitely not
suitable.


B.  Coordinate  Indexes 

Coordinate  indexes  provide  a  method  of  searching  for  an 
almost  unlimited  number  of  subject  terms,  or  descriptors,  as 
they  are  commonly  called.  Such  systems  are  found  in  two 
forms:  the  feature  card  system,  in  which  there  are  two 
files,  an  accession  file  and  a  descriptor  file,  and  the 
edge-notched  card  system,  where  there  is  only  one  file  of 
cards.

Feature  Card  Systems 

This type of index appears in two main forms, the
"Uniterm" system and the "optical coincidence" system.
Since  the  two  differ  mainly  in  the  nature  of  the  card 
and  equipment  used,  they  will  not  be  discussed 
separately,  but  will  be  described  in  a  more  general 
manner. 

As  I  have  stated  above,  feature  card  systems 
consist  of  two  files:  an  accession  file,  in  which  each 
item  included  is  given  a  unique  number,  usually  assigned 
when  it  is  first  located;  and  a  descriptor  file, 
consisting  of  a  card  for  each  term  used,  indicating  the 
accession  numbers  of  items  to  which  that  term  has  been 
assigned.  A  typical  accessions  card  will  give  author, 
title,  citation,  location,  and  list  the  descriptors 
assigned;  it  may  also  include  brief  comments  about  the 
item's  usefulness.  The  cards  must  be  kept  in  numerical 
order. Foskett15 suggests keeping the accession file in a
book, rather than on cards, since there is no reason to
disarrange
the  entries.  In  the  optical  coincidence  version, 
pre-numbered  cards  are  used  for  the  descriptor  file,  and 
the  numbers  of  relevant  documents  punched,  so  that 
superimposing  a  number  of  descriptor  cards  will  result 
in  light  shining  through  the  locations  of  items  given 
those  descriptors.  In  the  Uniterm  version,  the  card  has 
ten  columns;  numbers  of  relevant  items  are  written  in 
the  column  of  their  final  digit.  Searching  is  done  by 
comparing  numbers  until  a  match  is  found.  In  addition, 
cards  can  be  kept  for  author  and  date,  to  provide  access 
through  these  points. 
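
The coordinate search just described can be sketched in a few lines of Python. This is only an illustration, not part of any of the systems cited: the descriptor terms and accession numbers are invented, and the names `descriptor_file`, `uniterm_columns`, and `coordinate_search` are hypothetical. Superimposing optical coincidence cards, or comparing Uniterm columns, both amount to finding the accession numbers common to every chosen descriptor card.

```python
# Hypothetical descriptor file: one feature card per term, holding the
# accession numbers of the items indexed under that term.
descriptor_file = {
    "indexing": {3, 12, 27, 41},
    "personal files": {12, 19, 27, 58},
    "computers": {7, 12, 41, 58},
}


def uniterm_columns(numbers):
    """Lay out a Uniterm card: ten columns, keyed by final digit."""
    columns = {digit: [] for digit in range(10)}
    for number in sorted(numbers):
        columns[number % 10].append(number)
    return columns


def coordinate_search(descriptors, terms):
    """Return the accession numbers that appear on every chosen card."""
    postings = [descriptors.get(term, set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Items indexed under both "indexing" and "personal files".
print(coordinate_search(descriptor_file, ["indexing", "personal files"]))
```

The numbers returned must still be looked up in the accession file to obtain the full reference, which is the second step of the two-step search.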

Such  a  system  has  obvious  advantages  over  the 
previously  described  traditional  card  system,  since 
searching  is  much  simpler.  The  number  of  terms  used  as 
descriptors  is  virtually  unlimited,  though  a  large 
number  of  terms  results  in  a  vast  number  of  cards  to  be 
stored.  Since  the  descriptor  cards  are  kept  in 
alphabetical  order,  and  the  accession  file  in  numerical 
order,  a  great  deal  of  filing  may  need  to  be  done  if  the 
file  is  searched  regularly.  In  addition,  the  number  of 
documents  that  can  be  handled  by  such  a  system  is 
limited  by  the  size  of  the  card,  especially  in  the 
optical  coincidence  version.  They  are  generally  adequate 
for  up  to  a  thousand  items,  and  special  versions 


15A. C. Foskett. A guide to personal indexes, p. 45.

requiring expensive precision drilling equipment can
handle  up  to  ten  thousand  documents.  However,  searching 
is  a  two-step  procedure,  and  the  need  to  keep  two 
separate  files  again  results  in  a  great  deal  of  clerical 
work,  so  that  a  busy  researcher  on  his  own  may  find  it 
difficult  to  keep  up. 

Removing  an  item  from  the  file  is  also  difficult, 
particularly  in  the  optical  coincidence  system,  since 
the  accession  record  and  any  occurrences  of  the  number 
must  be  removed  from  the  descriptor  file. 

Edge-notched  Card  Systems 

Edge-notched  cards  are  index  cards  with  holes 
punched  around  their  edges.  A  single  hole  or  combination 
of  holes  stands  for  a  particular  term;  when  this  term  is 
used to index a document, the hole(s) is (are) punched.
One  card  is  used  per  document,  and  holes  punched  out  as 
needed  to  indicate  the  descriptors,  author,  and  source. 
Searching  is  done  using  a  long  needle;  it  is  inserted  in 
the  location  of  the  term  of  interest,  and  documents 
where  this  location  has  been  punched  will  fall.  Other 
terms  are  then  searched  on  the  fallen  cards  in  order  to 
do  a  complex  search. 

The  holes  can  be  assigned  meaning  in  a  number  of 
ways.  The  simplest  is  to  use  one  hole  per  subject,  but 
this  limits  the  number  of  terms  to  the  number  of  holes 
on  the  card.  In  order  to  make  more  efficient  use  of  the 
card,  it  is  possible  to  use  what  is  called  indirect 
coding, which means that two or more holes are used per
descriptor.  The  most  common  is  the  random  superimposed 
code16,  in  which  each  term  is  assigned  two  random 
numbers;  the  holes  for  these  numbers  are  punched  when  a 
document  is  assigned  the  term,  and  more  than  one 
descriptor  is  punched  per  field.  Using  a  two  hole  code 
can  greatly  increase  the  number  of  terms  that  can  be 
used,  but  it  does  make  adding  new  records  and  searching 
more  complicated,  since  the  code  book  must  be  consulted 
each  time  to  determine  the  code.  It  also  increases  the 
number  of  files  that  need  to  be  kept,  resulting  in  more 
clerical  work  to  be  done. 
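
A small Python sketch may make the two-hole superimposed code easier to follow. It is an illustration under stated assumptions, not a description of any published card design: the code book below is invented (and fixed rather than random, so that the result is repeatable), and the names `code_book`, `punch`, and `needle_sort` are hypothetical. A card "falls" from the needle when every hole of the search term has been punched on it.

```python
# Hypothetical code book: each descriptor is assigned two hole positions
# within a single field of the card.
code_book = {
    "indexing": (1, 4),
    "computers": (2, 7),
    "thesauri": (4, 7),
    "cards": (1, 2),
}


def punch(terms):
    """Return the set of holes punched for a document given its terms."""
    holes = set()
    for term in terms:
        holes.update(code_book[term])
    return holes


def needle_sort(deck, term):
    """Return the documents that fall when needling for one term."""
    wanted = set(code_book[term])
    return [doc for doc, holes in deck.items() if wanted <= holes]


deck = {
    "doc A": punch(["indexing", "computers"]),  # holes 1, 2, 4, 7
    "doc B": punch(["thesauri"]),               # holes 4, 7
    "doc C": punch(["cards"]),                  # holes 1, 2
}

print(needle_sort(deck, "indexing"))
print(needle_sort(deck, "thesauri"))
```

In this invented deck, needling for "thesauri" drops doc A as well as doc B, although doc A was never indexed under that term: its two other descriptors happen to cover holes 4 and 7 between them. Such false drops are a known cost of superimposing several codes in one field, and the searcher has to discard them by inspecting the fallen cards.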

As  many  researchers  have  shown,  however,  an 
edge-notched  system  can  be  an  effective  means  of 
providing  access  to  a  collection  of  references.  It  does, 
however,  become  time-consuming  to  search  if  there  are 
more  than  a  few  hundred  references,  since  only  two 
hundred  or  so  can  be  searched  thoroughly  at  a  time. 


16Gerald Jahoda. Information storage and retrieval systems
for  individual  researchers,  p.  67. 

C. Conclusions

Manual  systems  can  provide  an  effective  means  of  giving 
access  to  a  researcher's  collection,  particularly  if  he  has 
sufficient  time  to  spend  on  the  clerical  details.  However, 
there  is  considerable  duplication  of  information.  The  need 
to  consult  several  files  before  getting  the  results  of  the 
search  is  a  major  disadvantage,  making  searching  a 
complicated  procedure. 

IV. DATA BASE MANAGEMENT SYSTEMS


In  the  last  few  years,  data  base  management  systems 
have  become  more  and  more  common,  partly  because  of  the 
increasing  sophistication  of  computers.  They  are  capable  of 
handling  a  variety  of  information,  including  bibliographic. 
Some  examples  are  SPIRES  (Stanford  Public  Information 
REtrieval System), RIQS (Remote Information Query System),
and TDMS (Time-shared Data Management System). These are
complex  systems  developed  to  meet  a  variety  of  needs,  and  a 
large  number  of  users.  In  this  thesis,  I  will  discuss  the 
use  of  SPIRES  to  handle  bibliographic  information,  since  it 
is  the  data  base  management  system  in  use  at  the  University 
of  Alberta.  I  will  not  concentrate  on  specific  examples  of 
its  use,  but  will  instead  describe  the  general  procedures 
used  to  create,  update,  and  search  a  data  base, 
concentrating  on  the  needs  of  the  researcher. 

A. Stanford Public Information REtrieval System

SPIRES  is  a  data  base  management  system  developed  at 
Stanford  University  in  the  early  1970's.  It  was  originally 
designed  as  a  system  for  library  automation,  but  is  now  used 
for a variety of applications17.

In  order  to  use  SPIRES,  it  is  first  necessary  to  define 
the  file.  The  file  definition  process  consists  of 
determining  the  number  of  elements  and  characteristics  of 
each  element  in  a  record.  The  language  used  for  this  is 
fairly  complex,  and  it  is  unlikely  that  the  average 
researcher  would  have  the  time  (or  inclination)  to  become 
familiar  enough  with  it  to  define  his  own  file.  This  means 
that  he  would  have  to  hire  a  consultant  to  do  so,  and 
therefore  some  of  the  personal  aspect  is  lost,  since  an 
outsider  has  become  involved. 

The researcher will probably have at least four
elements in his file definition, all of them variable in
both number of occurrences per record and also in length.
The elements would be author, title, citation, and location.
The citation could also indicate the type of material, that
is, whether it is a book, periodical article, report, etc.,
in addition to the usual information. Alternatively, one
could define the type of material and date of publication
as separate elements; SPIRES can easily allow this.

Setting up or adding records to SPIRES is a fairly
complicated procedure. First, the information is placed in a

17G. R. Jackson. SPIRES/370 file definition. (Edmonton: The
University of Alberta, 1977) p.5.

standard sequential line file, with each element identified
in the form "ELEMENT-NAME=data;". Once the information is
free of mistakes (and this alone can involve considerable
effort), it is added to the data base using the SPIBILD
processor.  This  is  a  relatively  expensive  procedure,  since 
the  data  is  compared  with  the  file  definition  to  make  sure 
that  it  is  in  the  correct  form.  In  addition,  indexes  in  the 
form  of  inverted  files  may  be  built  for  elements  that  are 
often  searched.  Indexes  add  to  the  cost  of  building  the  data 
base,  but  to  use  SPIRES  efficiently,  and  therefore  make 
searching  cheaper,  they  are  essential. 

Searching  is  done  using  Boolean  logic  to  combine  terms; 
the  cost  of  a  search  will  depend  on  both  the  size  of  the 
subfile  and  on  the  number  of  terms  used,  as  well  as  on 
whether an index has been built for the element(s) searched.
Searching  for  unindexed  elements  is  expensive,  since  the 
file  must  be  searched  sequentially  to  see  if  the  wanted  term 
or  value  occurs. 
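
The difference between an indexed and an unindexed search can be illustrated with a short Python sketch. It does not reproduce SPIRES, its file definition language, or its command syntax; the records, element names, and the functions `build_index` and `sequential_search` are all hypothetical. The sketch only shows the general technique: an inverted file answers a query by looking up a term directly, Boolean operators combine the resulting sets, and an unindexed element forces a scan of every record.

```python
# Hypothetical records keyed by an internal record number.
records = {
    1: {"AUTHOR": ["Smith"], "SUBJECT": ["indexing", "cards"]},
    2: {"AUTHOR": ["Jones"], "SUBJECT": ["indexing", "computers"]},
    3: {"AUTHOR": ["Smith"], "SUBJECT": ["computers"]},
}


def build_index(recs, element):
    """Build an inverted file: each value maps to the records holding it."""
    index = {}
    for key, rec in recs.items():
        for value in rec.get(element, []):
            index.setdefault(value.lower(), set()).add(key)
    return index


def sequential_search(recs, element, value):
    """Scan every record in turn; needed when no index has been built."""
    wanted = value.lower()
    return {
        key
        for key, rec in recs.items()
        if wanted in (v.lower() for v in rec.get(element, []))
    }


subject_index = build_index(records, "SUBJECT")

# Boolean combination of indexed terms: AND, OR, AND NOT.
both = subject_index["indexing"] & subject_index["computers"]
either = subject_index["indexing"] | subject_index["computers"]
only_indexing = subject_index["indexing"] - subject_index["computers"]

# Combining an indexed element with an unindexed one.
smith_on_computers = subject_index["computers"] & sequential_search(
    records, "AUTHOR", "smith"
)
print(sorted(both), sorted(either), sorted(only_indexing), sorted(smith_on_computers))
```

The index lookup touches only the entries for the terms requested, while the sequential search reads every record, which is why unindexed searches grow more expensive as the file grows.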

Items  retrieved  from  a  search  are  displayed  or  typed  in 
the  reverse  order  of  input,  that  is,  with  the  ones  added 
most  recently  typed  first.  If  he  knows  the  SPIRES  formatting 
language,  the  researcher  can  also  vary  output  format 
according  to  his  needs.  This  feature  is  useful  if  the  output 
is  to  be  used  in  a  report  or  paper,  but  is  probably  not 
needed  by  the  searcher  who  merely  wants  to  know  where  a 
certain  reference  is  located,  or  what  he  has  collected  on  a 
certain  subject. 

From the preceding brief discussion of the
characteristics  of  SPIRES,  it  can  be  seen  that  using  a  data 
base  management  system  is  a  fairly  awkward  way  of  setting  up 
a  personal  system.  Creating  a  file  to  be  searched  is  a 
complicated  procedure,  and  storing  the  file  in  disk  storage 
takes  up  a  great  deal  of  space,  since  not  only  the  data,  but 
also  the  file  definition,  formats,  and  any  indexes  which  are 
created,  must  be  stored.  This  will  greatly  increase  the 
costs  of  maintaining  the  system,  since  as  the  data  base 
grows  in  size,  it  will  take  up  more  and  more  disk  space,  and 
the  owner  pays  for  the  amount  of  storage  used.  However,  once 
the  system  has  been  set  up,  searching  is  simple  and 
straightforward.  One  can  alter  existing  records  by 
transferring  the  complete  record  to  a  temporary  line  file, 
making  changes  using  the  operating  system  file  editor,  and 
then  re-processing  the  record.  An  entry  is  removed  by  simply 
deleting  it. 

Overall,  though,  it  is  apparent  that  SPIRES  in 
particular  is  not  suitable  for  the  researcher  looking  for  a 
simple  retrieval  system  to  help  him  keep  track  of  the 
literature  in  his  field.  It  is  interesting  to  note  that  none 
of  the  various  applications  described  by  Mittman  and 
Borman18  is  a  simple  personal  retrieval  system;  all  are 
complex  applications  making  use  of  the  ability  of  the  system 
to  do  much  more  than  merely  store  and  retrieve  information, 

18Benjamin  Mittman  and  Lorraine  Borman.  Personalized  data 
base  systems.  (Los  Angeles:  Melville  Publishing  Company, 
c1975)

and are in fact research data bases containing much more
than  simple  bibliographic  information.  Data  base  management 
systems  do  provide  an  excellent  means  of  providing  access  to 
a  variety  of  information,  and  can  analyse  and  plot  data  as 
well  as  retrieve  it.  They  are  also  suitable  for  creating  a 
data  base  for  use  by  many  people,  since  they  can  standardise 
data,  and  are  also  capable  of  providing  instruction19. 


19ibid. p.3-33

V. AN ALTERNATIVE RETRIEVAL METHOD


This  chapter  describes  a  program  written  in  PL/1,  based 
on a weighted-term algorithm previously described by Davis20.
The  algorithm  has  been  modified  to  allow  the  user  to  specify 
a  maximum  number  of  items  retrieved,  and  also  to  allow  him 
to  search  on  a  combination  of  elements.  Although  the  program 
described  here  has  been  written  in  PL/1,  it  could  be  adapted 
to  another  language  with  little  difficulty.  PL/1  was  chosen 
because of its built-in text-handling capabilities. The
operating  system  used  is  MTS,  since  it  is  in  use  at  the 
University  of  Alberta,  but  any  modern  operating  system  would 
be  suitable.  A  program  listing,  sample  data,  and  a  sample 
search  are  given  in  the  appendices  to  this  thesis. 

A  full  description  of  weighted-term  searching  is  given 
in  the  article  by  Davis,  and  I  will  briefly  summarise  the 
main  points.  The  searcher  assigns  arbitrary  weights  to  each 
term  used  in  the  search.  The  weight  of  a  particular  record 
is  then  the  sum  of  the  weights  of  any  search  terms  occurring 
in  it;  if  it  is  greater  than  a  threshold  weight  determined 
by  the  searcher,  the  information  is  printed  out.  Obviously, 
the  search  can  be  planned  so  that  only  certain  combinations 
of  terms  will  make  the  document  weight  greater  than  the 
threshold,  and  this  allows  a  flexible  search  strategy. 
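
The weighting scheme summarised above can be sketched as follows. This is a minimal Python illustration, not the PL/1 program given in the appendices and not Davis's original code. Several details are assumptions made for the example: terms are matched as case-insensitive substrings, hits are listed in descending order of weight, and the names `record_weight`, `weighted_search`, and `max_items` are hypothetical. The sample titles are invented.

```python
def record_weight(text, term_weights):
    """Sum the weights of the search terms that occur in the record."""
    lowered = text.lower()
    return sum(
        weight for term, weight in term_weights.items() if term.lower() in lowered
    )


def weighted_search(records, term_weights, threshold, max_items=None):
    """Return (weight, record) pairs whose weight exceeds the threshold."""
    hits = []
    for record in records:
        weight = record_weight(record, term_weights)
        if weight > threshold:
            hits.append((weight, record))
    # Assumption: heaviest records first; the sort is stable for ties.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits if max_items is None else hits[:max_items]


records = [
    "Personal retrieval systems on cards",
    "Card catalogues for retrieval",
    "Personal data bases on a minicomputer",
    "Growth of chemical literature",
]
weights = {"retrieval": 3, "personal": 3, "card": 1}

# With a threshold of 5, only records containing both "retrieval" and
# "personal" can qualify, so the weights act like a Boolean AND.
print(weighted_search(records, weights, threshold=5))

# Lowering the threshold to 3 also admits "retrieval" plus "card".
print(weighted_search(records, weights, threshold=3))
```

Choosing the weights and the threshold together is what gives the searcher control over which combinations of terms are acceptable, and the optional limit on the number of items keeps a loosely specified search from printing the whole file.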

20 Charles H. Davis. "A simple program for weighted-term
searching." Special Libraries, 63(9):381-384, September 1972.

The program has been written to make use of three fields,
which can then be used for any data element. In the example I
assign them to author, title, and citation/location. Data is
stored in a standard line file, using the apostrophe as a
delimiter; the end of the file is indicated by three 'ZZZ'
lines. Each element can be up to 200 characters long, and
therefore can be used for several "pieces" of data, if
desired; for example, one could combine a title with added
subject keywords. The use of three fields allows the user to
distinguish between elements, but does not complicate input
greatly. The program stores data in a two-dimensional array,
A(j,k). The index j identifies a particular record and runs
from 1 to 500, the maximum number of records; k runs from 1 to
3, corresponding to the three fields.
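
As a rough illustration of this file layout, the following Python sketch (not the thesis's PL/1 code) reads apostrophe-delimited elements into three-field records. The names load_records, MAX_RECORDS, and so on are hypothetical, and the sketch assumes that no element contains an apostrophe of its own and that the end marker is read as one record made up of three 'ZZZ' elements.

```python
import re

MAX_RECORDS = 500      # upper bound on j in A(j,k)
FIELDS_PER_RECORD = 3  # k runs from 1 to 3
MAX_FIELD_LEN = 200    # maximum length of one element
SENTINEL = "ZZZ"       # three 'ZZZ' elements mark the end of the file


def load_records(text):
    """Parse apostrophe-delimited elements into 3-tuples of fields."""
    # An element may continue over a line break; collapse it to one space.
    fields = [" ".join(raw.split()) for raw in re.findall(r"'([^']*)'", text)]
    records = []
    last_start = len(fields) - FIELDS_PER_RECORD + 1
    for start in range(0, last_start, FIELDS_PER_RECORD):
        record = tuple(fields[start:start + FIELDS_PER_RECORD])
        if all(field == SENTINEL for field in record):
            break  # end-of-file marker reached
        if any(len(field) > MAX_FIELD_LEN for field in record):
            raise ValueError("element longer than 200 characters")
        if len(records) == MAX_RECORDS:
            raise ValueError("more than 500 records; subdivide the file")
        records.append(record)
    return records


sample = """'DAVIS, C.H.'
'A SIMPLE PROGRAM FOR WEIGHTED-TERM SEARCHING.'
'SPEC. LIB., 63(9):381-384, SEPT. 1972.'
'FOSKETT, A.C.'
'A GUIDE TO PERSONAL INDEXES USING EDGE-NOTCHED, UNITERM,
AND PEEK-A-BOO CARDS.'
'1970. Z 695.9 F743 1970'
'ZZZ'
'ZZZ'
'ZZZ'
"""

records = load_records(sample)
print(len(records))
print(records[1][1])
```

Each tuple corresponds to one row of the array, with positions 0, 1, and 2 standing for k = 1, 2, and 3.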

I  am  assuming  that  a  personal  collection  larger  than 
five  hundred  items  can  be  subdivided  in  some  way.  The 
separation  may  be  according  to  major  subject,  or  it  may  be 
by something as arbitrary as date. In either case, the
researcher will then have two or more files, and should be
able to determine which one is best to search. This
assumption  is  supported  by  Engelbart21,  when  he  describes 
himself  as  using  exactly  this  sort  of  subdivision  to 
minimise  the  number  of  edge-notched  cards  to  be  searched  at 
one  time. 


21 Douglas C. Engelbart. "Special considerations of the
individual as a user, generator, and retriever of
information." American Documentation, 12(2):121-125, April
1961.
The use of three fields rather than one large one helps
minimise the amount of data to be searched, since the program
scans the file sequentially. This keeps the cost of the
search down, while at the same time the input format is not
complicated. Unlike SPIRES, no CPU time is used to process the
data into inverted files. Storage is also kept to a minimum,
since only the compiled program and data must be stored on
disk.

Running the program is very simple, as can be seen from
Appendix III. The searcher signs on, and issues the command
"$run search+*pl1lib par=data=filename@u(200)", where
"filename" is the name of the file in which the data is
stored. The program first fills the data array from this
file. The searcher is then asked to enter the maximum number
of items he would like retrieved. I assume that only a few
would be retrieved if the program is run in on-line mode, and
that for a large search, batch mode would be used. Next, the
searcher is asked to input the threshold weight and the
number of terms to be searched. The searcher then enters the
type of search, using the numbers 1, 2, or 3 to indicate which
element is to be searched. The search term is then entered,
again using the apostrophe as a delimiter, along with its
weight. This is repeated for each term. The computer then
carries out the search, scanning the element for the
occurrence of each search term. If a term occurs, the document
weight is increased by the weight of the term, and if the
document weight is greater than the threshold weight, the
record is printed. When either the maximum number of items has
been retrieved, or the end of the data is reached, the
searcher may continue with another search, or he can
terminate the session by entering a 0 (zero).
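
The search pass described above might be sketched as follows. This is a Python illustration, not the PL/1 program from Appendix I; the function name search and the tuple layout of the queries are hypothetical, and the sketch assumes that an element number (1, 2, or 3) is supplied with each term.

```python
def search(records, queries, threshold, max_items):
    """Scan records in file order; keep those whose weight exceeds threshold.

    records : list of (field1, field2, field3) tuples
    queries : list of (field_number, term, weight), field_number in 1..3
    """
    retrieved = []
    for record in records:
        if len(retrieved) >= max_items:
            break  # the searcher's maximum number of items has been reached
        weight = 0
        for field_number, term, term_weight in queries:
            # Substring test on the chosen element only.
            if term.upper() in record[field_number - 1].upper():
                weight += term_weight
        if weight > threshold:
            retrieved.append((weight, record))
    return retrieved


records = [
    ("DAVIS, C.H.", "AN INFORMATION RETRIEVAL PRIMER.", "1975"),
    ("DAVIS, C.H.", "A SIMPLE PROGRAM FOR WEIGHTED-TERM SEARCHING.",
     "SPEC. LIB., 63(9):381-384, SEPT. 1972."),
    ("KENT, A.", "INFORMATION ANALYSIS AND RETRIEVAL.", "1971. Z 699 K365"),
]
# Hypothetical query: author DAVIS (weight 2), title words WEIGHTED (2)
# and RETRIEVAL (1); threshold 2.
queries = [(1, "DAVIS", 2), (2, "WEIGHTED", 2), (2, "RETRIEVAL", 1)]

for weight, record in search(records, queries, threshold=2, max_items=10):
    print(weight, record[0], record[1])
```

Because the scan stops as soon as max_items records have been found, a small limit keeps an on-line search short even in a full 500-record file.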

New  records  are  added  to  the  system  by  using  the  file 
editor  capability  of  the  operating  system,  and  are 
immediately  searchable.  Changes  are  made  in  the  same  way.  A 
record  can  be  removed  from  the  file  simply  by  deleting  the 
lines  which  apply  to  it. 

One  possible  problem  with  this  system  is  due  to  its 
simplicity.  Since  the  file  is  searched  sequentially,  items 
are  retrieved  in  the  order  in  which  they  were  input.  A 
better  method  would  be  to  arrange  them  in  order  of 
decreasing  document  weight,  but  this  would  complicate  the 
program,  and  also  increase  the  cost  of  searching,  as  one 
would  have  to  store  all  retrieved  records,  prior  to  sorting 
by  document  weight.  In  principle,  this  could  mean  the 
addition  of  another  500  x  3  array. 
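
For comparison, the ranked alternative mentioned above could look like the following Python sketch (again not the thesis's PL/1 code; ranked_search is a hypothetical name, and an element number is assumed to accompany each term). Every record that passes the threshold is held in a list until the whole file has been scanned, which is the extra storage referred to in the text.

```python
def ranked_search(records, queries, threshold, max_items):
    """Return up to max_items matching records, heaviest first."""
    scored = []
    for record in records:
        weight = sum(term_weight
                     for field_number, term, term_weight in queries
                     if term.upper() in record[field_number - 1].upper())
        if weight > threshold:
            scored.append((weight, record))  # must be kept until the end
    # Stable sort: records of equal weight stay in file (input) order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:max_items]


records = [
    ("DAVIS, C.H.", "AN INFORMATION RETRIEVAL PRIMER.", "1975"),
    ("DAVIS, C.H.", "A SIMPLE PROGRAM FOR WEIGHTED-TERM SEARCHING.",
     "SPEC. LIB., 63(9):381-384, SEPT. 1972."),
    ("KENT, A.", "INFORMATION ANALYSIS AND RETRIEVAL.", "1971. Z 699 K365"),
]
queries = [(1, "DAVIS", 2), (2, "WEIGHTED", 2), (2, "RETRIEVAL", 1)]

for weight, record in ranked_search(records, queries, threshold=2, max_items=10):
    print(weight, record[1])
```

Unlike the sequential version, the scan here cannot stop early, since a later record may outweigh every record found so far.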

It can be seen, though, that this program does provide
an  alternative  to  manual  systems,  by  using  the  computer  to 
minimise  the  more  routine,  clerical  aspects  of  creating  and 
maintaining  a  retrieval  system,  while  at  the  same  time 
avoiding  the  complexity  of  a  data  base  management  system. 



VI.  SUMMARY 


Manual  systems,  data  base  management  systems,  and  an 
alternative  weighted-term  system  have  been  analysed  in  terms 
of  the  criteria  identified  in  Chapter  II.  The  following 
section  roughly  compares  them  in  terms  of  cost  and  time 
needed  to  set  up  and  use  the  system. 

1.  Cost  of  establishing 

Uniterm  <  edge-notched  cards  <  optical  coincidence  < 
weighted-term  <  SPIRES 

In  the  manual  systems,  the  cost  is  mainly  for 
materials,  and  will  range  from  a  few  dollars  for  simple 
Uniterm  cards  to  several  hundred  or  more  for  an  optical 
coincidence  system.  The  cost  of  the  weighted-term  system 
is  for  disk  storage:  a  file  of  200  records  will  cost 
approximately  $50.00  per  year.  SPIRES  records  must  be 
processed  prior  to  storing,  and  this  increases  the  cost 
to  some  $0.50  per  record,  or  $200  annually  for  200 
items. 

2.  Time  to  set  up 

weighted-term  <  edge-notched  cards  <  Uniterm,  optical 
coincidence  <  SPIRES 

3.  Cost  to  search 

SPIRES  <  weighted-term  (not  applicable  to  manual 
systems) 

Searching for four terms in a 200-record file using the
weighted-term system would cost about $1.50; the same search
would cost half that in SPIRES.

4.  Time  to  search 

SPIRES < weighted-term < optical coincidence < Uniterm <
edge-notched  cards 

From this, it is obvious that the cost and time involved in
each system vary, and the researcher should choose the one
that best suits his situation. Manual systems generally cost
less, but are more time-consuming to search, while automated
systems have a high initial cost, but are more easily
searched. The weighted-term system described in this thesis
presents an alternative to both manual and data base
management systems, since it requires little time to set up
and search, and costs are not prohibitive.




BIBLIOGRAPHY 

Artandi,  Susan.  An  introduction  to  computers  in  information 
science.  2nd  ed.  Metuchen:  Scarecrow  Press,  1972. 

Bannur, B. B. and G. M. Purandare. "Personal documentation -
a study." Annals of Library Science and Documentation,
14(4):182-186, December 1967.

Bourne,  Charles  P.  Methods  of  information  handling.  New 
York:  John  Wiley  &  Sons,  1963. 

Bridges, Kent W. "Automatic indexing of personal
bibliographies." Bioscience, 20(2):94-97, January 15, 1970.

Burton,  Hilary  D.  "Personal  documentation  methods  and 
practices  with  analysis  of  their  relation  to  formal 
bibliographic  systems."  Microform  Ph.D.  thesis. 
University  of  California,  1972.  1  microfilm,  positive. 

______. "Personal information systems: implications for
libraries." Special Libraries, 64(1):7-11, January 1973.

______ and Theodor B. Yerke. "FAMULUS: a computer-based
system for augmenting personal documentation efforts." In
American Society for Information Science. Cooperating
Information Societies. Proceedings of the annual meeting,
6:53-56, 1969.

Cushing,  Ralph.  "A  fresh  look  at  improving  personal  filing 
systems." Chemical Engineering, 70(1):73-88, January 7,
1968. 

Davis, Charles H. Illustrative computer programming for
libraries: selected examples for information specialists.
Westport: Greenwood Press, c1974.

______. An information retrieval primer. Ann Arbor, 1975.
paper.

______. "A simple program for weighted-term searching."
Special Libraries, 63(9):381-384, September 1972.

Engelbart, Douglas C. "Special considerations of the
individual as a user, generator, and retriever of
information." American Documentation, 12(2):121-125,
April 1961.

Feinberg,  Hilda.  Title  derivative  indexing  techniques. 
Metuchen:  Scarecrow  Press,  1973. 

Foskett, A. C. A guide to personal indexes using
edge-notched, Uniterm and peek-a-boo cards. 2nd ed.
London: Clive Bingley, c1970.

Heaps,  Doreen  M.  and  Paul  Sorenson.  "An  on-line  personal 
documentation  system."  In  American  Society  for 
Information  Science.  Information  transfer.  Proceedings 
of  the  annual  meeting,  5:201-207,  1968. 

Hoff, Wilbur J. "A retrieval system for health education
information." International Journal of Health Education,
9(2):87-93, June 1966.

Jackson, G. E. SPIRES/370 file definition. Edmonton: The
University  of  Alberta,  1977. 

Jahoda, Gerald. Information storage and retrieval systems
for individual researchers. New York: Wiley-Interscience,
1970.

______. "A technique for determining index requirements."
American Documentation, 15(2):82-85, April 1964.

______, R. D. Hutchins, and R. R. Galford. "Analysis of
case histories of personal index use." In American
Documentation Institute. Proceedings of the annual
meeting, 4:245-254, 1966.

______, ______, and ______. "Characteristics and use
of personal indexes maintained by scientists and
engineers in one university." American Documentation,
17(2):71-75, April 1966.

Jameson, David L. "Information retrieval for the working
scientist: a simple algorithm." Bioscience,
19(3):232-233, March 1969.

Kent,  Allen.  Information  analysis  and  retrieval.  New  York: 
Becker  and  Hayes,  c1971. 

Kochen,  Manfred.  Principles  of  information  retrieval.  Los 
Angeles:  Melville  Publishing  Company,  c1974. 

Krieger, K. A. "A punched card system for chemical
literature." Journal of Chemical Documentation,
26(3):163-166, March 1949.

Lancaster, F. Wilfrid. Information retrieval systems:
characteristics, testing, and evaluation. New York: John
Wiley & Sons, c1968.

______ and E. G. Fayen. Information retrieval on-line. Los
Angeles: Melville Publishing Company, c1973.

Leggate, Peter, B. M. Eaglestone, E. M. Jarman, M. M.
Norgett, and A. P. Williams. "An on-line system for
handling personal data bases on a PDP 11/20
minicomputer." Aslib Proceedings, 29(2):56-66, February
1977.

Linn, William E., Jr., and Walter Reitman. "Referential
communication  in  AUTONOTE,  a  personal  information 
retrieval  system."  In  Association  for  Computing 
Machinery.  Proceedings  of  the  annual  conference,  p. 
67-82,  1971. 

Loosjes, Th. P. On documentation of scientific literature.
London: Butterworths, 1973.

McPherson, Arlean. "Computer indexing with S.I.S. II and
FAMULUS." In American Society for Information Science.
Proceedings: third annual meeting, Banff, October 3,4,5,
1971. Calgary: Information Systems, 1971. p. 121-128.

Marshall,  K.  E.  "The  evolution  of  a  storage  and  retrieval 
system  for  indexes  and  annotated  bibliographic 
references."  In  American  Society  for  Information 
Science.  Proceedings:  third  annual  meeting,  Banff, 
October  3,4,5,  1971.  Calgary:  Information  Systems,  1971. 
p.  73-80. 

Meadow,  Charles  T.  The  analysis  of  information  systems.  2nd 
ed.  Los  Angeles:  Melville  Publishing  Company,  c1973. 

Mittman, Benjamin and Lorraine Borman. Personalized data
base systems. Los Angeles: Melville Publishing Company,
1975.

______ and Gilbert Krulee. "Development of a remote
information management system - RIMS." In American
Society for Information Science. Cooperating information
societies. Proceedings of the annual meeting, 6:199-206,
1969.

Norton, John H. "Setting up a personal information retrieval
system." Management Review, 59(5):2-9, March 1970.

Vickery, Brian Campbell. Classification and indexing in
science. 3rd ed. London: Butterworths, 1975.

Wallace,  Everett  M.  "Experience  with  EDP  support  of 
individuals  file  maintenance."  In  American 
Documentation  Institute.  Proceedings  of  the  annual 
meeting.  1:259-261,  1964. 

Wilkinson, William M. "Indexing a personal reference file."
Special Libraries, 50(1):16-18, January 1959.

Yerke, Theodor B. "Computer support of the researcher's own
documentation." Datamation, 16(2):75-78, February 1970.




APPENDIX I: Program listing

[The PL/1 program listing is not recoverable from the scanned
pages. The only legible statements are the following.]

PUT SKIP EDIT('ENTER THRESHOLD WEIGHT')(A);
PUT SKIP;


APPENDIX  II:  Sample  data 


'ARTANDI, S.'
'AN INTRODUCTION TO COMPUTERS IN INFORMATION SCIENCE.'
'1972. Z 699 A72 1972'
'BANNUR, B.B. AND G.M. PURANDARE'
'PERSONAL DOCUMENTATION - A STUDY.'
'ANN. LIB. SCI. DOC. 14(4):182-186, DECEMBER 1967'
'BOURNE, C.P.'
'METHODS OF INFORMATION HANDLING.'
'1963. Z 699 B77'
'BRIDGES, K.'
'AUTOMATIC INDEXING OF PERSONAL BIBLIOGRAPHIES.'
'BIOSCIENCE, 20(2):94-97, JANUARY 15, 1970.'
'BURTON, H.D.'
'PERSONAL DOCUMENTATION METHODS AND PRACTICES WITH ANALYSIS
OF THEIR RELATION TO FORMAL BIBLIOGRAPHIC SYSTEMS.'
'PH.D. THESIS. 1972'
'BURTON, H.D.'
'PERSONAL INFORMATION SYSTEMS: IMPLICATIONS FOR LIBRARIES.'
'SPEC. LIB., 64(1):7-11, JANUARY 1973.'
'CUSHING, R.'
'A FRESH LOOK AT IMPROVING PERSONAL FILING SYSTEMS.'
'CHEM. ENG., 70(1):73-88, JAN. 7, 1968.'
'DAVIS, C.H.'
'ILLUSTRATIVE COMPUTER PROGRAMMING FOR LIBRARIES: SELECTED
EXAMPLES FOR INFORMATION SPECIALISTS.'
'1974'
'DAVIS, C.H.'
'AN INFORMATION RETRIEVAL PRIMER.'
'1975'
'DAVIS, C.H.'
'A SIMPLE PROGRAM FOR WEIGHTED-TERM SEARCHING.'
'SPEC. LIB., 63(9):381-384, SEPT. 1972.'
'ENGELBART, D.C.'
'SPECIAL CONSIDERATIONS OF THE INDIVIDUAL AS A USER,
GENERATOR, AND RETRIEVER OF INFORMATION.'
'AM. DOC., 12(2):121-125, APRIL 1961.'
'FEINBERG, H.'
'TITLE DERIVATIVE INDEXING TECHNIQUES.'
'1973. Z695.92 F29 1973'
'FOSKETT, A.C.'
'A GUIDE TO PERSONAL INDEXES USING EDGE-NOTCHED, UNITERM,
AND PEEK-A-BOO CARDS.'
'1970. Z 695.9 F743 1970'
'HEAPS, D.M. AND P. SORENSON.'
'AN ON-LINE PERSONAL DOCUMENTATION.'
'ASIS. PROCEEDINGS, VOL. 5:201-207. 1968.'
'HOFF, W.J.'
'A RETRIEVAL SYSTEM FOR HEALTH EDUCATION INFORMATION.'
'INT. J. HEALTH ED., 9(2):87-93, JUNE 1966'
'JAHODA, G.'
'INFORMATION STORAGE AND RETRIEVAL SYSTEMS FOR INDIVIDUAL
RESEARCHERS'
'1970'
'JAHODA, G.'
'A TECHNIQUE FOR DETERMINING INDEX REQUIREMENTS.'
'AM. DOC., 15(2):82-85, APR. 1964.'
'JAHODA, G., R.D. HUTCHINS, AND R.R. GALFORD.'
'CHARACTERISTICS AND USE OF PERSONAL INDEXES MAINTAINED BY
SCIENTISTS AND ENGINEERS IN ONE UNIVERSITY.'
'AM. DOC., 17(2):71-75, APR. 1966.'
'JAHODA, G., R.D. HUTCHINS, AND R.R. GALFORD.'
'ANALYSIS OF CASE HISTORIES OF PERSONAL INDEX USE.'
'ADI. PROCEEDINGS, 4:245-254, 1966.'
'JAMESON, D.L.'
'INFORMATION RETRIEVAL FOR THE WORKING SCIENTIST: A SIMPLE
ALGORITHM'
'BIOSCIENCE, 19(3):232-233, MAR. 1969.'
'KENT, A.'
'INFORMATION ANALYSIS AND RETRIEVAL.'
'1971. Z 699 K365'
'LOOSJES, TH.P.'
'ON DOCUMENTATION OF SCIENTIFIC LITERATURE.'
'1973. Z 699 L86 1973'